50 research outputs found

    MAGNETO: cell type marker panel generator from single-cell transcriptomic data

    Get PDF
    Single-cell RNA sequencing experiments produce data useful to identify different cell types, including uncharacterized and rare ones. This enables us to study the specific functional roles of these cells in different microenvironments and contexts. After identifying a (novel) cell type of interest, it is essential to build succinct marker panels, composed of a few genes referring to cell surface proteins and clusters of differentiation molecules, able to discriminate the desired cells from the other cell populations. In this work, we propose a fully-automatic framework called MAGNETO, which can help construct optimal marker panels starting from a single-cell gene expression matrix and a cell type identity for each cell. MAGNETO builds effective marker panels solving a tailored bi-objective optimization problem, where the first objective regards the identification of the genes able to isolate a specific cell type, while the second conflicting objective concerns the minimization of the total number of genes included in the panel. Our results on three public datasets show that MAGNETO can identify marker panels that identify the cell populations of interest better than state-of-the-art approaches. Finally, by fine-tuning MAGNETO, our results demonstrate that it is possible to obtain marker panels with different specificity levels

    FiCoS: A fine-grained and coarse-grained GPU-powered deterministic simulator for biochemical networks.

    Get PDF
    Mathematical models of biochemical networks can largely facilitate the comprehension of the mechanisms at the basis of cellular processes, as well as the formulation of hypotheses that can be tested by means of targeted laboratory experiments. However, two issues might hamper the achievement of fruitful outcomes. On the one hand, detailed mechanistic models can involve hundreds or thousands of molecular species and their intermediate complexes, as well as hundreds or thousands of chemical reactions, a situation generally occurring in rule-based modeling. On the other hand, the computational analysis of a model typically requires the execution of a large number of simulations for its calibration, or to test the effect of perturbations. As a consequence, the computational capabilities of modern Central Processing Units can be easily overtaken, possibly making the modeling of biochemical networks a worthless or ineffective effort. To the aim of overcoming the limitations of the current state-of-the-art simulation approaches, we present in this paper FiCoS, a novel "black-box" deterministic simulator that effectively realizes both a fine-grained and a coarse-grained parallelization on Graphics Processing Units. In particular, FiCoS exploits two different integration methods, namely, the Dormand-Prince and the Radau IIA, to efficiently solve both non-stiff and stiff systems of coupled Ordinary Differential Equations. We tested the performance of FiCoS against different deterministic simulators, by considering models of increasing size and by running analyses with increasing computational demands. FiCoS was able to dramatically speedup the computations up to 855×, showing to be a promising solution for the simulation and analysis of large-scale models of complex biological processes

    Analysis of single-cell RNA sequencing data based on autoencoders

    Get PDF
    Abstract: Background: Single-cell RNA sequencing (scRNA-Seq) experiments are gaining ground to study the molecular processes that drive normal development as well as the onset of different pathologies. Finding an effective and efficient low-dimensional representation of the data is one of the most important steps in the downstream analysis of scRNA-Seq data, as it could provide a better identification of known or putatively novel cell-types. Another step that still poses a challenge is the integration of different scRNA-Seq datasets. Though standard computational pipelines to gain knowledge from scRNA-Seq data exist, a further improvement could be achieved by means of machine learning approaches. Results: Autoencoders (AEs) have been effectively used to capture the non-linearities among gene interactions of scRNA-Seq data, so that the deployment of AE-based tools might represent the way forward in this context. We introduce here scAEspy, a unifying tool that embodies: (1) four of the most advanced AEs, (2) two novel AEs that we developed on purpose, (3) different loss functions. We show that scAEspy can be coupled with various batch-effect removal tools to integrate data by different scRNA-Seq platforms, in order to better identify the cell-types. We benchmarked scAEspy against the most used batch-effect removal tools, showing that our AE-based strategies outperform the existing solutions. Conclusions: scAEspy is a user-friendly tool that enables using the most recent and promising AEs to analyse scRNA-Seq data by only setting up two user-defined parameters. Thanks to its modularity, scAEspy can be easily extended to accommodate new AEs to further improve the downstream analysis of scRNA-Seq data. Considering the relevant results we achieved, scAEspy can be considered as a starting point to build a more comprehensive toolkit designed to integrate multi single-cell omics

    A novel framework for MR image segmentation and quantification by using MedGA.

    Get PDF
    BACKGROUND AND OBJECTIVES: Image segmentation represents one of the most challenging issues in medical image analysis to distinguish among different adjacent tissues in a body part. In this context, appropriate image pre-processing tools can improve the result accuracy achieved by computer-assisted segmentation methods. Taking into consideration images with a bimodal intensity distribution, image binarization can be used to classify the input pictorial data into two classes, given a threshold intensity value. Unfortunately, adaptive thresholding techniques for two-class segmentation work properly only for images characterized by bimodal histograms. We aim at overcoming these limitations and automatically determining a suitable optimal threshold for bimodal Magnetic Resonance (MR) images, by designing an intelligent image analysis framework tailored to effectively assist the physicians during their decision-making tasks. METHODS: In this work, we present a novel evolutionary framework for image enhancement, automatic global thresholding, and segmentation, which is here applied to different clinical scenarios involving bimodal MR image analysis: (i) uterine fibroid segmentation in MR guided Focused Ultrasound Surgery, and (ii) brain metastatic cancer segmentation in neuro-radiosurgery therapy. Our framework exploits MedGA as a pre-processing stage. MedGA is an image enhancement method based on Genetic Algorithms that improves the threshold selection, obtained by the efficient Iterative Optimal Threshold Selection algorithm, between the underlying sub-distributions in a nearly bimodal histogram. RESULTS: The results achieved by the proposed evolutionary framework were quantitatively evaluated, showing that the use of MedGA as a pre-processing stage outperforms the conventional image enhancement methods (i.e., histogram equalization, bi-histogram equalization, Gamma transformation, and sigmoid transformation), in terms of both MR image enhancement and segmentation evaluation metrics. CONCLUSIONS: Thanks to this framework, MR image segmentation accuracy is considerably increased, allowing for measurement repeatability in clinical workflows. The proposed computational solution could be well-suited for other clinical contexts requiring MR image analysis and segmentation, aiming at providing useful insights for differential diagnosis and prognosis

    ACDC: Automated Cell Detection and Counting for Time-Lapse Fluorescence Microscopy.

    Get PDF
    Advances in microscopy imaging technologies have enabled the visualization of live-cell dynamic processes using time-lapse microscopy imaging. However, modern methods exhibit several limitations related to the training phases and to time constraints, hindering their application in the laboratory practice. In this work, we present a novel method, named Automated Cell Detection and Counting (ACDC), designed for activity detection of fluorescent labeled cell nuclei in time-lapse microscopy. ACDC overcomes the limitations of the literature methods, by first applying bilateral filtering on the original image to smooth the input cell images while preserving edge sharpness, and then by exploiting the watershed transform and morphological filtering. Moreover, ACDC represents a feasible solution for the laboratory practice, as it can leverage multi-core architectures in computer clusters to efficiently handle large-scale imaging datasets. Indeed, our Parent-Workers implementation of ACDC allows to obtain up to a 3.7× speed-up compared to the sequential counterpart. ACDC was tested on two distinct cell imaging datasets to assess its accuracy and effectiveness on images with different characteristics. We achieved an accurate cell-count and nuclei segmentation without relying on large-scale annotated datasets, a result confirmed by the average Dice Similarity Coefficients of 76.84 and 88.64 and the Pearson coefficients of 0.99 and 0.96, calculated against the manual cell counting, on the two tested datasets

    GenHap: a novel computational method based on genetic algorithms for haplotype assembly.

    Get PDF
    BACKGROUND: In order to fully characterize the genome of an individual, the reconstruction of the two distinct copies of each chromosome, called haplotypes, is essential. The computational problem of inferring the full haplotype of a cell starting from read sequencing data is known as haplotype assembly, and consists in assigning all heterozygous Single Nucleotide Polymorphisms (SNPs) to exactly one of the two chromosomes. Indeed, the knowledge of complete haplotypes is generally more informative than analyzing single SNPs and plays a fundamental role in many medical applications. RESULTS: To reconstruct the two haplotypes, we addressed the weighted Minimum Error Correction (wMEC) problem, which is a successful approach for haplotype assembly. This NP-hard problem consists in computing the two haplotypes that partition the sequencing reads into two disjoint sub-sets, with the least number of corrections to the SNP values. To this aim, we propose here GenHap, a novel computational method for haplotype assembly based on Genetic Algorithms, yielding optimal solutions by means of a global search process. In order to evaluate the effectiveness of our approach, we run GenHap on two synthetic (yet realistic) datasets, based on the Roche/454 and PacBio RS II sequencing technologies. We compared the performance of GenHap against HapCol, an efficient state-of-the-art algorithm for haplotype phasing. Our results show that GenHap always obtains high accuracy solutions (in terms of haplotype error rate), and is up to 4× faster than HapCol in the case of Roche/454 instances and up to 20× faster when compared on the PacBio RS II dataset. Finally, we assessed the performance of GenHap on two different real datasets. CONCLUSIONS: Future-generation sequencing technologies, producing longer reads with higher coverage, can highly benefit from GenHap, thanks to its capability of efficiently solving large instances of the haplotype assembly problem. Moreover, the optimization approach proposed in GenHap can be extended to the study of allele-specific genomic features, such as expression, methylation and chromatin conformation, by exploiting multi-objective optimization techniques. The source code and the full documentation are available at the following GitHub repository: https://github.com/andrea-tango/GenHap

    Integration of scATAC-Seq with scRNA-Seq Data

    No full text
    : Single-cell studies are enabling our understanding of the molecular processes of normal cell development and the onset of several pathologies. For instance, single-cell RNA sequencing (scRNA-Seq) measures the transcriptome-wide gene expression at a single-cell resolution, allowing for studying the heterogeneity among the cells of the same population and revealing complex and rare cell populations. On the other hand, single-cell Assay for Transposase-Accessible Chromatin using sequencing (scATAC-Seq) can be used to define transcriptional and epigenetic changes by analyzing the chromatin accessibility at the single-cell level. However, the integration of multi-omics data still remains one of the most difficult tasks in bioinformatics. In this chapter, we focus on the combination of scRNA-Seq and scATACSeq data to perform an integrative analysis of the single-cell transcriptome and chromatin accessibility of human fetal progenitors
    corecore